QVAC-20984 feat: add analytic gradchecked backward pass for the CAMPPlus speaker encoder by freddy311082 · Pull Request #61 · tetherto/qvac-ext-lib-whisper.cpp

freddy311082 · 2026-06-19T22:59:13Z

What

Makes the CAMPPlus speaker encoder differentiable for voice-clone enrollment by
adding an analytic, model-free C++ backward pass that returns d(loss)/d(fbank),
validated against the Task 2 finite-difference gradcheck harness. In the
enrollment loop CAMPPlus provides the speaker-similarity loss; the target-WAV
embedding stays forward-only (constant) and only the generated-audio path needs
gradients, so the gradient is the input gradient with the model weights frozen.
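To make "input gradient with the model weights frozen" concrete, here is a minimal sketch on a toy layer y = relu(W x + b). `ToyFrozenLayer` is a hypothetical name for illustration and is not the CampplusBackward class from this PR; it only mirrors the forward/backward surface described here, caching activations in forward and returning d(loss)/d(input) in backward without ever forming weight gradients.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy layer y = relu(W x + b); illustration only.
class ToyFrozenLayer {
public:
    ToyFrozenLayer(std::vector<std::vector<double>> w, std::vector<double> b)
        : w_(std::move(w)), b_(std::move(b)) {
        assert(w_.size() == b_.size());
    }

    // Forward pass; caches the pre-activation for the next backward call.
    std::vector<double> forward(const std::vector<double>& x) {
        pre_.assign(w_.size(), 0.0);
        for (std::size_t o = 0; o < w_.size(); ++o) {
            assert(w_[o].size() == x.size());
            pre_[o] = b_[o];
            for (std::size_t i = 0; i < x.size(); ++i) {
                pre_[o] += w_[o][i] * x[i];
            }
        }
        std::vector<double> y(pre_.size());
        for (std::size_t o = 0; o < pre_.size(); ++o) {
            y[o] = pre_[o] > 0.0 ? pre_[o] : 0.0;
        }
        return y;
    }

    // Input gradient only: d_x = W^T (d_y masked by pre > 0).
    // The weights are frozen, so no d_W or d_b is computed.
    std::vector<double> backward(const std::vector<double>& d_y) const {
        assert(d_y.size() == pre_.size());
        const std::size_t n_in = w_.empty() ? 0 : w_[0].size();
        std::vector<double> d_x(n_in, 0.0);
        for (std::size_t o = 0; o < w_.size(); ++o) {
            if (pre_[o] <= 0.0) continue;  // ReLU blocks the gradient here
            for (std::size_t i = 0; i < n_in; ++i) {
                d_x[i] += w_[o][i] * d_y[o];
            }
        }
        return d_x;
    }

private:
    std::vector<std::vector<double>> w_;  // frozen weights
    std::vector<double> b_;               // frozen bias
    std::vector<double> pre_;             // cached pre-activation
};
```

Chaining such layers and calling backward in reverse order yields the gradient with respect to the first input, which is the role d(loss)/d(fbank) plays in the enrollment loop.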

Follows the same pattern as the sibling tickets already on master
(#55 text-encoder tail / QVAC-20978, #58 vector estimator / QVAC-20982,
#60 vocoder / QVAC-20983): a pure double reference backward, gradchecked
component-wise, with the op×backend gap documented. Dependencies: Task 2
(QVAC-20979).
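For readers unfamiliar with the gradcheck pattern, the following is a minimal sketch of a central finite-difference check. `gradcheck_max_abs_err` and its signature are assumptions made for illustration; the actual Task 2 harness API is not shown in this PR and may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper, not the Task 2 harness API. Returns the largest
// absolute gap between an analytic gradient and central finite differences
// of a scalar loss evaluated at x.
double gradcheck_max_abs_err(
    const std::function<double(const std::vector<double>&)>& loss,
    const std::vector<double>& x,
    const std::vector<double>& analytic_grad,
    double eps = 1e-5) {
    assert(x.size() == analytic_grad.size());
    std::vector<double> probe = x;
    double max_err = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        probe[i] = x[i] + eps;
        const double up = loss(probe);
        probe[i] = x[i] - eps;
        const double down = loss(probe);
        probe[i] = x[i];  // restore before perturbing the next element
        const double numeric = (up - down) / (2.0 * eps);
        max_err = std::max(max_err, std::fabs(numeric - analytic_grad[i]));
    }
    return max_err;
}
```

Running everything in double keeps the rounding error of the difference quotient small, which is one reason a pure double reference is convenient for this kind of check.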

Forward-parity anchor

A gradcheck alone is self-referential: it only proves the backward is the exact
derivative of its own forward. To tie that to the real model, a second test
asserts the analytic double forward matches the production scalar forward
(campplus_embed_cpu) on synthetic weights (max_abs ≈ 3e-8, i.e. float-vs-double
rounding only). Building it surfaced that campplus_embed_cpu's fcm_forward
hardcodes the input feature dim to 80, so the production CPU path is only
self-consistent at feat_dim=80 (the parity test uses that). The analytic
backward derives every dimension from feat_dim, so it is geometry-agnostic.
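The parity metric itself is simple. As a rough sketch, assuming the production forward yields a float embedding and the analytic forward a double one, the comparison could look like the following. `max_abs_diff`, `forward_parity_ok` and the default tolerance are hypothetical and are not taken from test_campplus_backward_parity.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: largest absolute element-wise gap between a float
// embedding (production-style forward) and a double embedding (reference).
double max_abs_diff(const std::vector<float>& prod,
                    const std::vector<double>& ref) {
    assert(prod.size() == ref.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < prod.size(); ++i) {
        const double gap = std::fabs(static_cast<double>(prod[i]) - ref[i]);
        worst = std::max(worst, gap);
    }
    return worst;
}

// Parity holds when the gap stays within a float-rounding tolerance.
// The default tolerance is an assumed value, not the one used in the PR.
bool forward_parity_ok(const std::vector<float>& prod,
                       const std::vector<double>& ref,
                       double tol = 1e-6) {
    return max_abs_diff(prod, ref) <= tol;
}
```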

Changes

- src/campplus_backward.{h,cpp}: new CampplusBackward class (namespace
  cp_grad). Owns the frozen weights and caches per-call activations as state;
  public surface is forward(fbank) / backward(d_emb). Channel-major (C, T)
  layout mirroring campplus_embed_cpu exactly. Implements the CAMPPlus
  primitives and their input-gradients: stride/pad/dilation-aware conv1d/conv2d,
  pre-fused affine batch norm, ReLU, sigmoid, time-mean, segment pooling,
  statistics pooling (mean + unbiased std), the FCM Conv2d residual block (with
  optional shortcut) and the CAMDenseTDNN layer (context-attention gate + dense
  concat split).
- test/test_campplus_backward.cpp: gradchecks every primitive, the FCM
  residual block, the CAM dense-TDNN layer and the full chain (12 checks) against
  central finite differences via the Task 2 harness. Always-on unit ctest tier
  (no model/fixtures, no-skip policy).
- test/test_campplus_backward_parity.cpp: forward parity vs the production
  campplus_embed_cpu (see above). Also unit tier.
- docs/voiceclone-backward-campplus.md: op×backend gap matrix and
  CPU-fallback rationale for enrollment.
- CMakeLists.txt: register the test-campplus-backward and
  test-campplus-backward-parity targets.
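As an illustration of what one primitive and its input-gradient look like, here is a minimal sketch of statistics pooling (mean + unbiased std) over the time axis of a single channel. The function names are hypothetical, and the sketch assumes no epsilon is added under the square root; the code in src/campplus_backward.cpp may differ on both counts. With m the mean and s the unbiased std over T frames, the input gradient is d_x[t] = d_mean / T + d_std * (x[t] - m) / ((T - 1) * s).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct StatsPool {
    double mean;
    double std_dev;  // unbiased: divides by (T - 1)
};

// Forward: mean and unbiased standard deviation over time for one channel.
StatsPool stats_pool_forward(const std::vector<double>& x) {
    assert(x.size() >= 2);  // unbiased std needs at least two frames
    const double t = static_cast<double>(x.size());
    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= t;
    double ss = 0.0;
    for (double v : x) ss += (v - mean) * (v - mean);
    return {mean, std::sqrt(ss / (t - 1.0))};
}

// Backward: gradient w.r.t. each frame given upstream d_mean and d_std.
// Assumption: no epsilon under the sqrt, so std_dev must be non-zero.
std::vector<double> stats_pool_backward(const std::vector<double>& x,
                                        double d_mean, double d_std) {
    const StatsPool p = stats_pool_forward(x);
    assert(p.std_dev > 0.0);
    const double t = static_cast<double>(x.size());
    std::vector<double> d_x(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        d_x[i] = d_mean / t
               + d_std * (x[i] - p.mean) / ((t - 1.0) * p.std_dev);
    }
    return d_x;
}
```

The std term sums to zero across frames because the deviations from the mean do, which is a quick sanity check on any implementation of this primitive.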

CPU fallback (documented)

SIGMOID, SQRT, MEAN, SUM_ROWS, PAD, REPEAT and CONCAT have no
backward in the vendored ggml, so the enrollment backward cannot use ggml
autodiff on any backend. It is instead provided as the analytic C++ backward and
runs on CPU (enrollment is offline; the realtime synthesis GPU fast paths are
untouched). See docs/voiceclone-backward-campplus.md for the full matrix.

Acceptance

Gradcheck green: test-campplus-backward and test-campplus-backward-parity
both pass (2/2 in the unit tier).

…oder

Make CAMPPlus differentiable for the voice-clone enrollment loop: an analytic C++ backward returning d(loss)/d(fbank) with frozen weights (target-WAV embedding stays forward-only). Mirrors campplus_embed_cpu in channel-major layout. Covers FCM (Conv2d + residual blocks), TDNN, CAMDenseTDNN blocks (context-attention gate + dense concat), stats pooling and the dense head.

Tests (always-on unit tier, model-free):
- test-campplus-backward: gradcheck every primitive + full chain vs central finite differences (Task 2 harness).
- test-campplus-backward-parity: analytic double forward vs production campplus_embed_cpu on synthetic weights.

QVAC-20984

github-actions · 2026-06-19T22:59:23Z

Review Status

Current Status: ❌ PENDING
Approvals so far: none

Pending reviews: Needs 1 Management or Team Lead, and 1 more from Management, Team Lead, or Member.

Address PR #61 review notes (non-blocking):
- Parity test now builds CAM blocks with num_layers 2/3/2 (was 1/1/1) so the dense-concat accumulation (layer i enters with C_in + i*growth) is anchored to the production forward, not only to the self-referential full-chain gradcheck. Parity stays green (max_abs ~4.6e-08, max_rel ~8.9e-08).
- Document the trust chain in the parity test header and the gap-matrix doc: every campplus_embed caller in the repo (main.cpp, test-campplus, test-voice-embedding) uses the scalar CPU forward, which is validated against the Python reference; campplus_embed_ggml is not wired to any caller yet.
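The dense-concat accumulation mentioned in this commit can be sketched with a few lines of channel arithmetic. The helper names below are hypothetical, and the example sizes used with them (C_in = 128, growth = 32) are illustrative values, not figures from the CAMPPlus configuration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: input channel count seen by each layer of a dense
// block, where layer i receives the block input concatenated with the
// outputs of all earlier layers, i.e. c_in + i * growth channels.
std::vector<int> dense_layer_input_channels(int c_in, int growth,
                                            int num_layers) {
    assert(c_in > 0 && growth > 0 && num_layers >= 0);
    std::vector<int> channels;
    channels.reserve(static_cast<std::size_t>(num_layers));
    for (int i = 0; i < num_layers; ++i) {
        channels.push_back(c_in + i * growth);
    }
    return channels;
}

// Channel count leaving the block after every layer has been concatenated.
int dense_block_output_channels(int c_in, int growth, int num_layers) {
    return c_in + num_layers * growth;
}
```

With a single layer the only input width is C_in, so the `i * growth` term is never exercised. That is why moving the parity test from 1/1/1 to 2/3/2 layers adds coverage of the accumulation.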

freddy311082 requested review from a team as code owners June 19, 2026 22:59

GustavoA1604 approved these changes Jun 22, 2026

Zbig9000 approved these changes Jun 22, 2026

GustavoA1604 merged commit f4208d2 into master Jun 22, 2026
70 of 75 checks passed

GustavoA1604 merged 2 commits into master from QVAC-20984/ggml-backward-campplus
